76 research outputs found

    A comparative evaluation of sequence classification programs

    Background: A fundamental problem in modern genomics is to taxonomically or functionally classify DNA sequence fragments derived from environmental sampling (i.e., metagenomics). Several different methods have been proposed for doing this effectively and efficiently, and many have been implemented in software. In addition to varying their basic algorithmic approach to classification, some methods screen sequence reads for ‘barcoding genes’ like 16S rRNA, or various types of protein-coding genes. Due to the sheer number and complexity of methods, it can be difficult for a researcher to choose one that is well-suited for a particular analysis. Results: We divided the very large number of programs that have been released in recent years for solving the sequence classification problem into three main categories based on the general algorithm they use to compare a query sequence against a database of sequences. We also evaluated the performance of the leading programs in each category on data sets whose taxonomic and functional composition is known. Conclusions: We found significant variability in classification accuracy, precision, and resource consumption of sequence classification programs when used to analyze various metagenomics data sets. However, we observe some general trends and patterns that will be useful to researchers who use sequence classification programs. https://doi.org/10.1186/1471-2105-13-9

    Phylogeny of Cladobranchia (Gastropoda: Nudibranchia): a total evidence analysis using DNA sequence data from public databases

    Cladobranchia is a clade of charismatic and exclusively marine slugs (Gastropoda: Nudibranchia). Though Cladobranchia and its sister taxon, Anthobranchia, have been supported by molecular data, little resolution among the higher-level groups within these two clades has emerged from previous analyses. Cladobranchia is traditionally divided into three taxa (Dendronotida, Euarminida, and Aeolidida), none of which have been supported by molecular phylogenetic studies. Reconstructions of the evolutionary relationships within Cladobranchia have resulted in poorly supported phylogenies, rife with polytomies and non-monophyletic groups contradicting previous taxonomic hypotheses. In this study, we present a working hypothesis for the evolutionary history of Cladobranchia, utilizing publicly available data that have been generated since the last attempt at a detailed phylogeny for this group (we include approximately 200 more taxa and a total of five genes). Our results resolve Cladobranchia as monophyletic and provide support for a small proportion of genera and families, but it is clear that the presently available data are insufficient to provide a robust and well-resolved phylogeny of these taxa as a whole.

    Prey preference follows phylogeny: evolutionary dietary patterns within the marine gastropod group Cladobranchia (Gastropoda: Heterobranchia: Nudibranchia)

    The impact of predator-prey interactions on the evolution of many marine invertebrates is poorly understood. Since barriers to genetic exchange are less obvious in the marine realm than in terrestrial or freshwater systems, non-allopatric divergence may play a fundamental role in the generation of biodiversity. In this context, shifts between major prey types could constitute important factors explaining the biodiversity of marine taxa, particularly in groups with highly specialized diets. However, the scarcity of marine specialized consumers for which reliable phylogenies exist hampers attempts to test the role of trophic specialization in evolution. In this study, RNA-Seq data is used to produce a phylogeny of Cladobranchia, a group of marine invertebrates that feed on a diverse array of prey taxa but mostly specialize on cnidarians. The broad range of prey type preferences allegedly present in two major groups within Cladobranchia suggests that prey type shifts are relatively common over evolutionary timescales. In the present study, we generated a well-supported phylogeny of the major lineages within Cladobranchia using RNA-Seq data, and used ancestral state reconstruction analyses to better understand the evolution of prey preference. These analyses answered several fundamental questions regarding the evolutionary relationships within Cladobranchia, including support for a clade of species from Arminidae as sister to Tritoniidae (which both preferentially prey on Octocorallia). Ancestral state reconstruction analyses supported a cladobranchian ancestor with a preference for Hydrozoa and show that the few transitions identified only occur from lineages that prey on Hydrozoa to those that feed on other types of prey. There is strong phylogenetic correlation with prey preference within Cladobranchia, suggesting that prey type specialization within this group has inertia. Shifts between different types of prey have occurred rarely throughout the evolution of Cladobranchia, indicating that this may not have been an important driver of the diversity within this group. https://doi.org/10.1186/s12862-017-1066-

    Toward reconstructing the evolution of advanced moths and butterflies (Lepidoptera: Ditrysia): an initial molecular study

    Background: In the mega-diverse insect order Lepidoptera (butterflies and moths; 165,000 described species), deeper relationships are little understood within the clade Ditrysia, to which 98% of the species belong. To begin addressing this problem, we tested the ability of five protein-coding nuclear genes (6.7 kb total), and character subsets therein, to resolve relationships among 123 species representing 27 (of 33) superfamilies and 55 (of 100) families of Ditrysia under maximum likelihood analysis. Results: Our trees show broad concordance with previous morphological hypotheses of ditrysian phylogeny, although most relationships among superfamilies are weakly supported. There are also notable surprises, such as a consistently closer relationship of Pyraloidea than of butterflies to most Macrolepidoptera. Monophyly is significantly rejected by one or more character sets for the putative clades Macrolepidoptera as currently defined (P < 0.05) and Macrolepidoptera excluding Noctuoidea and Bombycoidea sensu lato (P ≤ 0.005), and nearly so for the superfamily Drepanoidea as currently defined (P < 0.08). Superfamilies are typically recovered or nearly so, but usually without strong support. Relationships within superfamilies and families, however, are often robustly resolved. We provide some of the first strong molecular evidence on deeper splits within Pyraloidea, Tortricoidea, Geometroidea, Noctuoidea and others. Separate analyses of mostly synonymous versus non-synonymous character sets revealed notable differences (though not strong conflict), including a marked influence of compositional heterogeneity on apparent signal in the third codon position (nt3). As available model partitioning methods cannot correct for this variation, we assessed overall phylogeny resolution through separate examination of trees from each character set. Exploration of "tree space" with GARLI, using grid computing, showed that hundreds of searches are typically needed to find the best-feasible phylogeny estimate for these data. Conclusion: Our results (a) corroborate the broad outlines of the current working phylogenetic hypothesis for Ditrysia, (b) demonstrate that some prominent features of that hypothesis, including the position of the butterflies, need revision, and (c) resolve the majority of family and subfamily relationships within superfamilies as thus far sampled. Much further gene and taxon sampling will be needed, however, to strongly resolve individual deeper nodes.

    Three-dimensional structure of a viral genome-delivery portal vertex.

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-Å-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a ~1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200-Å-long α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

    Tetracosahexaenoylethanolamide, a novel N-acylethanolamide, is elevated in ischemia and increases neuronal output.

    N-acylethanolamines (NAEs) are endogenous lipid-signaling molecules derived from fatty acids that regulate numerous biological functions, including in the brain. Interestingly, NAEs are elevated in the absence of fatty acid amide hydrolase (FAAH) and following CO2-induced ischemia/hypercapnia, suggesting a neuroprotective response. Tetracosahexaenoic acid (THA) is a product and precursor to DHA; however, the NAE product, tetracosahexaenoylethanolamide (THEA), has never been reported. Presently, THEA was chemically synthesized as an authentic standard to confirm THEA presence in biological tissues. Whole brains were collected and analyzed for unesterified THA, total THA, and THEA in wild-type and FAAH-KO mice that were euthanized by either head-focused microwave fixation, CO2 + microwave, or CO2 only. PPAR activity by transient transfection assay and ex vivo neuronal output in medium spiny neurons (MSNs) of the nucleus accumbens by patch clamp electrophysiology were determined following THEA exposure. THEA in the wild-type mice was nearly doubled ( 0.05) transcriptional activity of PPARs relative to control, but 100 nM of THEA increased (P < 0.001) neuronal output in MSNs of the nucleus accumbens. Here we identify a novel NAE, THEA, in the brain that is elevated upon ischemia/hypercapnia and by KO of the FAAH enzyme. While THEA did not activate PPAR, it augmented the excitability of MSNs in the nucleus accumbens. Overall, our results suggest that THEA is a novel NAE that is produced in the brain upon ischemia/hypercapnia and regulates neuronal excitation.

    Pan-genome and phylogeny of Bacillus cereus sensu lato

    Background: Bacillus cereus sensu lato (s. l.) is an ecologically diverse bacterial group of medical and agricultural significance. In this study, I use publicly available genomes and novel bioinformatic workflows to characterize the B. cereus s. l. pan-genome and perform the largest phylogenetic and population genetic analyses of this group to date in terms of the number of genes and taxa included. With these fundamental data in hand, I identify genes associated with particular phenotypic traits (i.e., “pan-GWAS” analysis), and quantify the degree to which taxa sharing common attributes are phylogenetically clustered. Methods: A rapid k-mer based approach (Mash) was used to create reduced representations of selected Bacillus genomes, and a fast distance-based phylogenetic analysis of this data (FastME) was performed to determine which species should be included in B. cereus s. l. The complete genomes of eight B. cereus s. l. species were annotated de novo with Prokka, and these annotations were used by Roary to produce the B. cereus s. l. pan-genome. Scoary was used to associate gene presence and absence patterns with various phenotypes. The orthologous protein sequence clusters produced by Roary were filtered and used to build HaMStR databases of gene models that were used in turn to construct phylogenetic data matrices. Phylogenetic analyses used RAxML, DendroPy, ClonalFrameML, PAUP*, and SplitsTree. Bayesian model-based population genetic analysis assigned taxa to clusters using hierBAPS. The genealogical sorting index was used to quantify the phylogenetic clustering of taxa sharing common attributes. Results: The B. cereus s. l. pan-genome currently consists of ≈60,000 genes, ≈600 of which are “core” (common to at least 99% of taxa sampled). Pan-GWAS analysis revealed genes associated with phenotypes such as isolation source, oxygen requirement, and ability to cause diseases such as anthrax or food poisoning. Extensive phylogenetic analyses using an unprecedented amount of data produced phylogenies that were largely concordant with each other and with previous studies. Phylogenetic support as measured by bootstrap probabilities increased markedly when all suitable pan-genome data was included in phylogenetic analyses, as opposed to when only core genes were used. Bayesian population genetic analysis recommended subdividing the three major clades of B. cereus s. l. into nine clusters. Taxa sharing common traits and species designations exhibited varying degrees of phylogenetic clustering. Conclusions: All phylogenetic analyses recapitulated two previously used classification systems, and taxa were consistently assigned to the same major clade and group. By including accessory genes from the pan-genome in the phylogenetic analyses, I produced an exceptionally well-supported phylogeny of 114 complete B. cereus s. l. genomes. The best-performing methods were used to produce a phylogeny of all 498 publicly available B. cereus s. l. genomes, which was in turn used to compare three different classification systems and to test the monophyly status of various B. cereus s. l. species. The majority of the methodology used in this study is generic and could be leveraged to produce pan-genome estimates and similarly robust phylogenetic hypotheses for other bacterial groups.
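
    The "core" definition quoted in the abstract above (genes common to at least 99% of taxa sampled) can be illustrated with a minimal sketch. This is not the Roary implementation or the author's workflow; the function name `core_genes`, the input layout (gene name mapped to the set of genomes carrying it), and the toy data are all assumptions made for illustration.

```python
def core_genes(presence, threshold=0.99):
    """Return the genes carried by at least `threshold` of all genomes.

    presence: dict mapping a gene name to the set of genome identifiers
    in which that gene was found (a hypothetical input layout).
    """
    # The sampled genomes are taken to be every genome seen for any gene.
    genomes = set()
    for carriers in presence.values():
        genomes |= carriers
    total = len(genomes)
    if total == 0:
        return set()
    # A gene is "core" when its carrier fraction meets the threshold.
    return {
        gene
        for gene, carriers in presence.items()
        if len(carriers) / total >= threshold
    }


# Toy data: 100 hypothetical genomes named g0 .. g99.
all_genomes = {f"g{i}" for i in range(100)}
presence = {
    "geneA": set(all_genomes),                  # present in 100 of 100
    "geneB": all_genomes - {"g0"},              # present in 99 of 100
    "geneC": all_genomes - {"g0", "g1"},        # present in 98 of 100
}
core = core_genes(presence)
```

    With the toy data, geneA and geneB meet the 99% threshold and geneC falls into the accessory set, which is the distinction the abstract draws between core genes and the rest of the pan-genome.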